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ABSTRACT 

We present a fast and user friendly exoplanet transit light curve modelling package PyTransit, 
implementing optimised versions of the [Gimenez ( 2006| l and Mandel & Agol ( |2002| l transit 
models. The package offers an object-oriented Python interface to access the two models im¬ 
plemented natively in Fortran with OpenMP parallelisation. A partial OpenCL version of the 
quadratic Mandel-Agol model is also included for GPU-accelerated computations. The aim 
of PyTransit is to facilitate the analysis of photometric time series of exoplanet transits con¬ 
sisting of hundreds of thousands of datapoints, and of multi-passband transit light curves from 
spectrophotometric observations, as a part of a researchers programming toolkit for building 
complex, problem-specihc, analyses. 
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1 INTRODUCTION 

The rapid increase in computational power during the last decades 
has allowed for the adoption of increasingly robust statistical meth¬ 
ods in the analysis of astrophysical data. Specially, combining a 
fully Bayesian approach to inference with Markov Chain Monte 
Carlo (MCMC) sampling for the posterior estimation has allowed 
for improved characterisation of model parameter uncertainties in 
parameter estimation problems, while the adoption of Bayesian 
model selection has given us the tools to robustly judge between 
many competing hypotheses aiming to explain our observational 
data. The increased flexibility and robustness come with a price. 
While the methods allow us to work with complex models with a 
high number of dimensions (free parameters), increasing dimen¬ 
sionality quickly increases the number of likelihood evaluations 
required for the analysis. Further, while the computers keep get¬ 
ting faster, also the size of the observational datasets (and the re- 
serachers’ ambitions) keep growing thanks to the advancements in 
instrumentation and observation techniques. 

In the analysis of photometric times series of exoplanet tran¬ 
sits (transit light curves), the size and complexity of the observa¬ 
tional datasets has increased due to the introduction of space-based 
telescopes CoRoT and Kepler, observing potentially hundreds of 
individual transits for a single transiting planetQdue to introduc¬ 
tion of lucky-imaging techniques allowing for very high time reso¬ 
lution observations; and due to the maturing of spectrophotometry 
as a transit observation method. 


* And, while the Kepler long-cadence (LC) mode produces a relatively 
small number of exposures per transit, the modelling of long cadence data 
requires model supersampling to account for the extended integration time 
jKipping|2010j. 


The space-based telescopes and lucky-imaging cameras pro¬ 
duce light curves with tens to hundreds of thousands exposures, 
while the ground-based spectrophotometric observations of indi¬ 
vidual transits yield a smaller number of exposures, but with an 
additional dimension (number of passbands extracted from the ob¬ 
served spectra) to our time series. Both the stellar limb darkening 
and the planetary radius vary as a function of the wavelength cov¬ 
erage of the passband, and in typical cases the dimensionality of 
the parameter space increases by 3-5 free parameters per passband 
(radius ratio, at least two limb darkening parameters, assuming we 
are not overly reliant on the theoretical limb darkening coefficients, 
and 1-2 parameters to model the baseline flux). This increase in di¬ 
mensionality leads to a significant increase in the number of likeli¬ 
hood evaluations needed to obtain a representative posterior sample 
using MCMC techniques, and, thus, an analysis of a single spec¬ 
trophotometric transit can require equal amounts of computation 
resources as an analysis of a Kepler light curve covering hundreds 
of transits. 

The transit shape model forms the core of the forward model 
in the transit light curve analysis. The model describes the depen¬ 
dence of the observed flux as a function of stellar limb darkening, 
planet-star radius ratio, and the planet-star distance (from centre to 
centre, expressed in stellar radii.) The field of transit light curve 
modelling is largely dominated by two analytical approaches: a set 
of transit models for different stellar limb darkening parametrisa- 


tions by|Mandel & Agol|i 

2002|l, and a versatile series-expansion- 

based model presented by 

Gimenez J2006l. Both the Gimenez and 


Mandel-Agol models (henceforth G and MA models, respectively) 
allow the most computationally intensive calculations to be fac¬ 
tored out from the effects due to limb darkening, accelerating the 
calculation of multiple simultaneously observed passbands with 
different limb darkening coefficients significantly. 
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Here we present a Python package offering a straight-forward 
way to access the |Gimenez] ( |2006^ and [Mandel & Agol| j2002^ 
transit models. The package has been used for transit light curve 
modelling in nine peer-reviewed publications over four years, and 
can be considered production ready. The models are implemented 
in Fortran 2003 (based on the original FORTRAN?? implementa¬ 
tions by the respective authors) with OpenMP multithreading and 
model-specific optimisations aimed to minimise the model evalua¬ 
tion cost. Both models can be computed exactly, or using interpola¬ 
tion for improved speed, and a partial OpenCL implementation of 
the interpolated quadratic Mandel-Agol model is included for GPU 
computing. The package includes the necessary utility routines to 
calculate circular and elliptic orbits (using either Newton’s method, 
iteration, or two series approximations), transit durations, eclipse 
centres, etc., and offers a simple interface combining the orbit and 
the transit model computations that selects the most appropriate 
orbit calculation routine based on the eccentricity. Examples and 
tutorials on using the code are included in the package and online|^ 

While the MA and G models are the two most commonly used 
approaches for transit modelling, both more generic and specialised 
modelling tools exists. [Abubekerov & Gost^ ( |2013[ > have derived 
analytical expressions for the transit shape (and its derivatives, also 
derived by|Pal| j2008| for the quadratic MA model) for a wider set 
of limb-darkening models than offered by |Mandel & Agol] and of¬ 
fer an example implementation written in C. The JKTEBOP pack¬ 
age by |Southworth| j2008| > offers a versatile numerical approach 
to transit modelling where both the host star and the planet are 
modelled as biaxial spheroids. This goes beyond basic transit shape 
model, allowing for the modelling of the reflection and ellipsoidal 
effects as a function of orbital phase. [Bames| ( |2009l l has introduced 
a numerical approach for modelling transits over rapidly-rotating 
stars with significant gravity-darkening due to stellar oblateness. 
The transits over rapidly rotating stars can be use to probe for spin- 
orbit misalignment, since misaligned orbits will show asymmetric 
transits. Einally, |Pal| \1Q12) have considered the problem of mod¬ 
elling mutual transits in multi-planet systems, something the MA 
or G models cannot be used for directly. 

PyTransit aims to offer a Pythonic access to the tools for one 
part of the planet characterisation problem: the modelling of the 
flux decrement due to an occulting planet as a function planet-star 
distance, planet-star radius ratio and stellar limb darkening. Several 
other transit modelling packages offer similar functionality with 
their o wn advantages and limitations. EXOFAST by |Eastman et al. 

1 2013 I is an IDL library for transit and radial velocity modelling|^ 
The authors gain significant improvements in the evaluation speed 
of the quadratic MA model by swapping the original method for 
computing the elliptic integral of the third kind with a faster one. 
However, the use of IDL, while still relatively popular in astro¬ 
physics, limits the package’s adoptability]^ 


^ See the notebooks directory from https://github.com/ 
hpparvi/PyXransit for IPython notebook examples, and https: 

/ /github. com/hpparvi/exo_tutorials for more in-depth tuto¬ 
rials on exoplanet characterisation in general. 

^ With a web-front-end at http://astroutils . astronomy. 
Ohio-sta te.edu/exof ast/exofast.shtml 
■* However, [Eastman et al.| also offer Python and Fortran implementations 
of their faster MA model. The Transit Analysis Package (TAP,|Gazak et al.j 
[201 2) is an IDL transit modelling package with a graphical user interface. 
The package implements the wavelet based likelihood function byjCarter &| 
jWinnj j2009) that accounts for correlated noise (of very specific statistical 
characteristics) in the photometry, improving the robustness of the param- 


PyTransit advocates the toolkit-based approach where the 
analysis code is constructed using a set of tools best suited for the 
problem at hand. This is similar to EXOFAST (although the scope 
of the package is significantly narrower), and lower-level than what 
offered by TAP and PlanetPack. The approach offers significant 
flexibility what comes to working with different MCMC samplers, 
optimisers, implementing noise models, etc., but it also requires the 
end-user to have a slightly higher level of experience than what is 
required by the off-the-shelf analysis packages. 


2 THE TRANSIT MODELS 

2.1 The Gimenez Model 

2.1.1 Overview 

[Gimenez] j2006| [200?^ describe a versatile series-expansion based 
transit model developed originally for eclipsing binaries by|Kopal[ 
\1911) . The normalised flux, /, is expressed as 

f{k,z)=l-a{k,z), (1) 


where a stands for the fractional loss of light due to the transiting 
planet, k is the star-planet radius ratio, and z is the projected star- 
planet distance divided by the stellar radius. The a functions are 
described as 

N 

a(k, z) = 2 (b, c), (2) 

n=0 


where C„ are factors that depend only on n limb darkening coeffi¬ 
cients, b = k/(l + k), and c = z/O + k), and the a„ functions are 
expressed in terms of Jacobi polynomials, G„(p, q\ x) as 


a„(b, c) 


^ b\i-pr' 
vr(r+l) 


X Z%(-in2j + y + 2) 


r(v+;+l) 

r(j+ 2 ) 


X Gj(v + 2,v + 1; I - bf X Gj(v + 2,1; c^). 


(3) 


where v = (n + 2)/2, and F is the gamma function. 

The accuracy of the Gimenez model is determined by Nj, the 
number of Jacobi polynomials used in Eq. Figure [T] shows the 
model’s maximum and mean absolute deviations from the exact 
solution for z = [0..1 -l- fc] as a function of Nj. The figure is only il¬ 
lustrative, since the exact deviations depends on the limb darkening 
and the spanned z-values (but the dependency on limb darkening is 
small relative to the dependency on Nj.) Relatively small Nj (~40) 
can be found to be sufficient for modelling ground-based observa¬ 
tions. Also, a small Nj can be first used to first obtain a quick rough 
solution, which can then be refined by increasing Nj. 


2.1.2 Limb darkening 


The Gimenez model produces transit light curves that follow a gen¬ 
eral limb darkening law, 


_ 1 V M 

7 ( 1 )“^ ^“«(1 


(4) 


eter estimates. Finally, the latest up date of PlanetPack l|Baluev|2Q14} has 
included transit modelling using the [Abubekerov & Gostev [ transit model 
and several correlated noise models. PlanetPack is a command-line pro¬ 
gram written in C++, and thus its use has the pros and cons of a an analysis 
approach based on a single monolithic program. However, this is alleviated 
by its open-source nature combined with its independence from proprietary 
packages and languages. 
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Figure 1. Maximum and mean (black and grey lines, respectively) absolute 
deviations for the Gimenez model with two limb darkening coefficients as 
a function of number of Jacobi polynomials used. 


where N stands for the order of the model, u„ is the n'* limb dark¬ 
ening coefficient, /r = cos 7 , and 7 is the foreshortening angle (the 
angle between the surface normal and the line of sight.) The general 
law equals to linear limb darkening for = 1 , 


7(1) 


(5) 


Eq.j^can be written as 

a^{b, c) = ^^(l-c^)”' ^A„jG/v-h2, v-(-1; \-bf Gj{v+2, l;c^),(ll) 
1=0 

and A„j can be precomputed into a two-dimensional lookup table 
given the number of limb darkening coefficients and the depth of 
the series expansion in the beginning of the computations. 

Next, the Jacobi polynomials Gj can be calculated using re¬ 
cursion as 

Go(q,p;x)^ 1, 

Gi(q,p-,x)= ((2 + q +p)x + q-p)l2, (12) 

Gi+i(q, p; x) = ((ji -H j2x)Gi -I- j^Gi-i)! jA, 

where the coefficients also depend only on the expansion depth 
and number of limb darkening coefficients, and can be precom¬ 
puted into another two-dimensional lookup table. 

After A„j and j„ have been precomputed, the evaluation of 
the Gimenez model for a given radius ratio, projected distance, and 
limb darkening coefficients requires only summations and multipli¬ 
cations, making the computation of the model for a large number 
of points fast. 

2.2 The Quadratic Mandel-Agol Model 

2.2.1 Overview 


and to the quadratic limb darkening law for N = 2, 

Kp) 2 

^^ = l-a(l-p)-b(l-pf, ( 6 ) 

where the quadratic law coefficients (a, b) are transformed to the 
general law coefficients as = a + 2b and M 2 = —b. 


2.1.3 Optimisation: Simultaneous Multiple Passbands 


The limb darkening enters into the|Gimenez|model as coefficients 


1 - 

1 _ -yN nu„ ’ 
^ Zj7i=1 n+2 


(7) 


C 


n 


Uij 

1 _ nu,i ’ 


( 8 ) 


where C„ do not depend on k or z. The coefficients are cheap to 
compute, and by first calculating the a„ in Eq. the model can 
be evaluated with low computational cost for different sets of limb 
darkening coefficient vectors u, as 


N 

ai(k,z) = / C„,ia„(b,c). 


(9) 


This is beneficial for the analysis of spectrophotometric data, since 
we do not need to evaluate the full model for each separate pass- 
band. 


2.1.4 Optimisation: Precomputing the Model Coefficients 


A quick look at the Eq.|^shows that the computationally most ex¬ 
pensive parts of the equation do not depend on k or z, but only on 
the depth of the expansion and the number of limb darkening coef¬ 
ficients. If we abbreviate 


{-\y{2j + v + 2) nv + j+l) 

yr(v-H) r(y + 2) 


( 10 ) 


[Mandel & Agol| \2QQ2) introduced a set of analytical transit light 
curve models for several different limb darkening laws, of which 
PyTransit implements the uniform and quadratic model. The flux, 
/, for a transit over a stellar disk with quadratic limb darkening is 


/(fe,z) = 1- 


(1 - c)Ag{k,z) + cAci{k,z) -I- bed{k,z) 
\-al2,-bl6 


(13) 


where k is the radius ratio, z is the projected distance, c = a + 2b, 
a and b are the quadratic limb darkening coefficients, and A^, A^, 
and qj are functions that depend on k and z as defined in|Mandel &| 
|AgoI| ( | 2002 t . 


2.2.2 Optimisation: Simultaneous Multiple Passbands 

As with the |Gimenez| model, the effects from limb darkening are 
factored out from the most expensive computations (those of A^, Aj, 
and qf), which, again, allows for efficient computation of multiple 
simultaneous passbands with different limb darkening coefficients. 


2.2.3 Optimisation: Precomputing Interpolation Tables 

The functions T^, and q^ in Eq. [T^ can be precomputed into 
two-dimensional interpolation tables spanning z = [ 0 .. 1 -t k] and 
k ranges based on the prior set on k. The model evaluation can 
now be done by first interpolating the values of the three functions, 
followed by the summations, multiplications, and one division (two 
of the divisions can be replaced with multiplications) in Eq.|13| 
The maximum absolute error (here defined as the deviation 
from the non-interpolated model) for the interpolated MA model 
using the default values (ni^ = 128 and = 256) for Afc = 0.02 
(that is, we have an uniform prior on the radius ratio from ko to 
kf, + Ak) is ~4 ppm, and the average absolute deviation over the 
whole transit is 0.05 ppm. Both of these values are well below the 
limits that can be achieved Kepler or CoRoT. Decreasing Mk has a 
smaller effect on the introduced error than decreasing and even 
Mk = 8 for Ak = 0.02 yields a maximum absolute error of ~8 ppm. 
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3 IMPLEMENTATION 

3.1 Overview 

The transit models are implemented in Fortran 2003 based on the 
original FORTRAN?? implementations by |Gimenez| and |Mandel| 
|& Agol| The shared-memory parallelisation is carried out using 
OpenMP. The Python package offers easy-to-use Python classes 
wrapping the Fortran models with a common interface for both 
models. The Python interface offers automatic model supersam¬ 
pling given the number of subsamples and integration time, and 
model interpolation in z-space is implemented for the Gimenez 
model (the use of interpolation tables for the quadratic MA model 
makes any gains of z-space interpolation insignificant.) 


3.2 Model Supersampling 

A single point in a photometric time series corresponds to an in¬ 
tegration of the flux over the exposure time. If the changes in the 
observed signal are small compared to the noise level over a single 
exposure, we can approximate this integration with a single value 
evaluated at the centre of the exposure. However, this approxima¬ 
tion fails when the exposure time is long enough to integrate over 
modelled features—such as with Kepler's, long time cadence mode 
( |Kipping|2010^ —and also the model needs to be integrated over 
the exposure. 

PyTransit offers transit model supersampling to account for 
long exposure times given the number of subsamples and the ex¬ 
posure time. The model is evaluated for n evenly distributed sub¬ 
samples inside each exposure, where n is given by the user based 
on exposure time and transit duration, and each light curve point 
corresponds to the mean of the subsamples. 

3.3 Gimenez Model Interpolation in z-Space 

While the optimisations described in Sec. |2.1.4] makes the Gimenez 
model computationally efficient, simple model interpolation in z- 
space can offer a notable speedup with light curves containing hun¬ 
dreds of thousands of points. 

The code implements an alternative interpolated mode for the 
evaluation of the Gimenez model, where the transit model is first 
evaluated for n points for z < 1 - k and m points for i — k < z < 
k+i, and the light curve points are interpolated from these tabulated 
values. The maximum absolute error (again, the difference between 
the interpolated and non-interpolated models) for the default grid 
size is ~25 ppm, and the average error is ~2 ppm. 


3.4 Partial OpenCL Acceleration of the MA model 

The package implements an OpenCL version of the interpolated 
quadratic Mandel-Agol model, where the interpolation of T^, Aj, 
and T/rf is offloaded to the GPU. Given the computational simplicity 
of bilinear interpolation used, the overheads from memory trans¬ 
fer to and from the GPU dominate the evaluation time, making 
the model evaluation in GPU significantly slower than in CPU for 
small light curves. For basic usage, the OpenCL accelerated model 
is faster than the multithreaded Fortran implementation for light 
curves with > 2 x 10^ points. However, significant speedups can 
be reached with smaller light curves if both the z and likelihood 
calculations are also offloaded to the GPU to minimise the memory 
transfer. 


3.5 Applicability to Transmission Spectroscopy 

Both the Gimenez and Mandel-Agol models allow for efficient 
evaluation for multiple simultaneously observed passbands, where 
the differences in the transit shape (and observed depth) reflect the 
differences in the stellar limb darkening in each passband. How¬ 
ever, this does nothing to include the effects from the variations in 
the effective radius ratio k, the parameter of interest when carrying 
out transmission spectroscopy. 

The effects on the transit depth by varying k are included in 
PyTransit by assuming that the relative changes in the effective ra¬ 
dius ratio are a small fraction of its average value. Now, the changes 
in k affect only the transit depth, and the changes in other observ¬ 
ables (the duration and shape of the transit) are below observation 
limits. Thus, given a set of n radius ratios, „, the model is eval¬ 
uated using the average radius ratio, k, and then multiplied for each 
passband by a correction factor PjP. 


4 PERFORMANCE 

We benchmark the model performance using two setups: 

(i) Intel Linux desktop (64 bit Ubuntu Precise), Intel i?-3??0 
(4 cores at 3.4 GHz), GFortran 4.6.3, optimisation flags -Ofast 
-march=native. 

(ii) AMD Linux desktop (64 bit Ubuntu Trusty), AMD FX-8350 
(8 cores at 2.8 GHz), GFortran 4.9, optimisation flags -Ofast 
-march=native. 

With the exception of threading, the performance scales equiva¬ 
lently for both setups (the absolute performance is also comparable 
when considering the difference in the clock rates), and we show 
the results only for the first setup. 

Figure]^ shows the absolute model evaluation times as a func¬ 
tion of the number of light curve points for the directly evaluated 
and interpolated transit models. The modelled light curves have 
35% of in-transit points and 65% of out-of-transit points, corre¬ 
sponding to a typical observation setup. The interpolation for the 
Gimenez model is carried out using linear interpolation in z-space, 
while the interpolated Mandel-Agol model uses bilinear interpola¬ 
tion with the two-dimensional interpolation tables for T^, A^ and 
77 j. The quadratic Mandel-Agol model is significantly faster than 
the two-coefficient Gimenez model, but increasing the number of 
limb darkening coefficients does not affect the performance of the 
Gimenez model significantly. The OpenCL version of the interpo¬ 
lated Mandel-Agol model is slower than the Fortran version for 
small light curves due to memory transfer, but this can be allevi¬ 
ated by offloading also the rest of the computations to the GPU. 

Figurej^shows the model evaluation times per passband rela¬ 
tive to the evaluation time for a single passband. The speedup from 
calculating all passbands simultaneously when working with spec- 
trophotometric data is obvious. 

Figure]^ shows the evaluation time as a function of OpenMP 
threads for the setup with an Intel i? processor. The multithreading 
scaling is slightly different for the two setups: the optimal perfor¬ 
mance is obtained with three threads for the Intel setup and six 
threads for the AMD setup (not shown). 
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Figure 3. Model evaluation times as a function of simultaneous passbands 
relative to the evaluation time for a single passband. Continuous lines show 
the results for the direct models, and slashed lines for the interpolated mod¬ 
els. Note that the gains for the interpolated models are smaller since the 
evaluation times for the parts not dependent on limb darkening are smaller. 


Figure 2. Model evaluation times for a single light curve with n points. 
The topmost panel shows the directly evaluated MA model (solid line, us¬ 
ing four threads), and the model inteipolated in Aj, A^ and tjj (dashed line.) 
The middle panel shows the MA model with the interpolation carried out 
using GPU (coded in OpenCL): the black solid line corresponds to model 
evaluation with two memory transfers (first copying the z array to the GPU, 
then reading the flux an'ay from the GPU); the dashed line comesponds to 
model evaluation without transferring the flux from the GPU (the likeli¬ 
hood computation is also offloaded to the GPU); and the dotted line cor¬ 
responds to model evaluation where also the z-airay calculation is done in 
GPU. The solid grey line shows the CPU-interpolated MA model results 
for reference. The lowest panel shows the model evaluation times for the 
Gimenez model. The grey solid line shows the results for the directly eval¬ 
uated quadratic model, the grey dashed line for the model interpolated in 
z-space, and the black lines correspond to the evaluation of four-parameter 
general limb darkening model. 


5 CONCLUSIONS AND DISCUSSION 

We have described a Python package offering optimised versions 
of the Gimenez and Mandel-Agol transit models. The package is 
freely available from github 

https://github.com/hpparvi/PyTransit 

and under continuing development. The Fortran routines are also 
directly usable as Fortran modules. The package comes with 
IPython notebook examples showing the use and features of the 
code. More in-depth tutorials covering exoplanet characterisation 
from transit light curves can be found from 


https://github.com/hpparvi/exo_tutorials 

again implemented as IPython notebooks. 

Interpolation can be used to speed up both of the models, and 
the performance gain is especially significant for the MA model. 
The errors introduced by the interpolation are small, but systematic, 
and error evaluation is recommended if extreme precision (~ 10“^ 
for the MA model, ~ 10“^ for the G model) is required. The maxi¬ 
mum deviations for the interpolated MA mode occur at the end of 
ingress and the beginning of egress. 

The two transit models have different advantages: the 
quadratic Mandel-Agol model is significantly faster than the two- 
coefficient Gimenez model, but the Gimenez model allows for 
higher-order limb darkening. Other Mandel-Agol models with 
higher-order limb darkening may be implemented in the future, but 
the flexibility of the Gimenez model makes this a low-priority task. 


The code has been developed and tested over several years. 


and has been used in|Parviainen et al. 1 2014 i;|Murgas et al. (2014 1 ; 

Gandolfi et al. P015 1 ; |Tingley et al.|(2014 

; IParviainen et a . 

(2013 

i; Gandolfi et al. (|2013J;|Murgas et al. 

2012 1 ; Rouan et a . 

(2012 

i; and Tingley et al.|( 2011 1 . 
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Mandel-Agol model, quadratic LD 



Gimenez model, quadratic LD 



Figure 4. Model evaluation times for the Intel setup as a function of threads 
relative to the evaluation time of the respective model with one thread. Con¬ 
tinuous lines show the results for the direct models, and slashed lines for the 
interpolated models. 
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